Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize all LSP URIs #1660

Merged
merged 4 commits into from
Sep 4, 2024
Merged

Normalize all LSP URIs #1660

merged 4 commits into from
Sep 4, 2024

Conversation

msujew
Copy link
Member

@msujew msujew commented Sep 2, 2024

Addresses an issue discussed in #1619.

The main issue is that VS Code itself uses a non spec-compliant URI format in which they encode the C: to c%A3. This behavior is also mirrored in the vscode-uri package, which we use to store LangiumDocument objects.

However, language clients like LSP4E correctly send C: - which leads us down a very deep rabbit hole. In the end, the best solution to this is to write our own TextDocuments service that normalizes the incoming URIs to the same schema used by VS Code. This way, any incoming URI can find its respective document in the internal map.

Note that it seems like LSP4E can deal with our outgoing URIs just fine - it's the incoming URIs that are problematic.

@msujew msujew added the LSP Language Server Protocol integration label Sep 2, 2024
Copy link
Contributor

@borisbrodski borisbrodski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on Windows with Eclipse 2024-06-R. Works great!

@msujew msujew added this to the v3.2.0 milestone Sep 4, 2024
Copy link
Contributor

@sailingKieler sailingKieler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subclassing the existing TextDocuments service and, e.g., wrapping the Map _syncedDocuments doesn't work (or wouldn't be more elegant) as everything is private or buried within inline function expressions, correct? So I agree to copy-paste the TextDocuments service.

I assume this thing is quite stable in the vscode repo and would only have to be touched in case new document event types would be introduced by the vscode folks, right?

@msujew
Copy link
Member Author

msujew commented Sep 4, 2024

Subclassing the existing TextDocuments service and, e.g., wrapping the Map _syncedDocuments doesn't work (or wouldn't be more elegant) as everything is private or buried within inline function expressions, correct?

Yes, It'd be great if we could have something like a normalizeUri method in the TextDocumentsConfiguration interface, but right now, we need to create our own class due to the inline functions.

@sailingKieler
Copy link
Contributor

I'm however wondering what about uri references that are included when notifying the LC about diagnostics and the like, see

uri: uri.toString(),
diagnostics: []
});
}
});
documentBuilder.onDocumentPhase(DocumentState.Validated, async (document) => {
if (document.diagnostics) {
connection.sendDiagnostics({
uri: document.uri.toString(),

@msujew
Copy link
Member Author

msujew commented Sep 4, 2024

@sailingKieler I've mentioned this in the initial post:

Note that it seems like LSP4E can deal with our outgoing URIs just fine - it's the incoming URIs that are problematic.

Outgoing URIs are no problems, as the language client should parse the malformed URIs correctly - it's just that we cannot deal with "correctly" formatted URIs on the incoming side right now.

@sailingKieler
Copy link
Contributor

Note that it seems like LSP4E can deal with our outgoing URIs just fine - it's the incoming URIs that are problematic.

Outgoing URIs are no problems, as the language client should parse the malformed URIs correctly

That's an interesting assumption. 😅

I assume that VS Code implementations sitting at the beginning of the message pipeline of our LS (like Connection) are in charge of this specific encoding, I believe you that LSP4E and others are happy now.
I'm just wondering whether a Langium-based LS in VS Code on Windows still works as expected, as we now don't provide "other" uris in notifications than we got earlier.

@msujew
Copy link
Member Author

msujew commented Sep 4, 2024

I assume that VS Code implementations sitting at the beginning of the message pipeline of our LS (like Connection) are in charge of this specific encoding

They're not, the they just pass in whatever the language client gives them. They don't perform any manipulation on the LSP data. We do :)

I'm just wondering whether a Langium-based LS in VS Code on Windows still works as expected, as we now don't provide "other" uris in notifications than we got earlier.

Since I'm running Langium on Windows, I've naturally tested that it still works as expected. Note that this change is a no-op for VS Code, since the URIs we receive from VS Code are the exact same format as the one's we build ourselves using vscode-uri.

@msujew msujew force-pushed the msujew/normalize-uris branch from a356895 to 47ae6b8 Compare September 4, 2024 12:06
@sailingKieler
Copy link
Contributor

I think I got it now.

They're not, the they just pass in whatever the language client gives them. They don't perform any manipulation on the LSP data. We do :)

You're basically saying: Everything that happens in https://github.com/microsoft/vscode-languageserver-node/blob/f58f4dff16ad2760028bb1cb95e882de30a1000f/server/src/common/textDocuments.ts#L189-L244 relies on the plain uri strings provided by the language client, while we're doing

this.fireDocumentUpdate([URI.parse(change.document.uri)], []);
on first contact with a new document, and later on document.uri.toString(), e.g. in
const uriString = document.uri.toString();
if (this.documentMap.has(uriString)) {
throw new Error(`A document with the URI '${uriString}' is already present.`);
}
this.documentMap.set(uriString, document);
}
getDocument(uri: URI): LangiumDocument | undefined {
const uriString = uri.toString();

With your changes the book keeping in TextDocuments is in sync with what LangiumDocuments & friends do. 👍

@msujew
Copy link
Member Author

msujew commented Sep 4, 2024

@sailingKieler Exactly - and the main issue is essentially this:

const textDocument = this.textDocuments?.get(document.uri.toString());

We try to get the text document based on the langium URI - however, these don't match on Windows for a bunch of language clients. That's why we need to keep every URI formatted in the same way.

Note that the latest commit adds a few more utility methods to removes a few hacks inside our test framework.

@sailingKieler
Copy link
Contributor

Note that the latest commit adds a few more utility methods to removes a few hacks inside our test framework.

Had a look. Great that we can get rid of those workarounds at this occasion. 👌

@msujew msujew merged commit 27ab6b2 into main Sep 4, 2024
5 checks passed
@msujew msujew deleted the msujew/normalize-uris branch September 4, 2024 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LSP Language Server Protocol integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants